Comparative Analysis of Outlier Detection Techniques
Author
Abstract
Data mining is the extraction of interesting patterns from massive data sets. Outlier detection, an important aspect of data mining, identifies observations that deviate from expected behavior. Outlier detection and analysis is sometimes called outlier mining. In this paper, we provide a broad and comprehensive literature survey of outliers and outlier detection techniques in one place, so as to explain the richness and complexity associated with each technique. We also give a broad comparison of the methods within the different families of outlier detection techniques.
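As a concrete illustration of "observations that deviate from expected behavior", the following is a minimal sketch of one simple statistical detector, the modified z-score based on the median and the median absolute deviation (MAD). It is not taken from the paper: the function name `mad_outliers`, the sample `readings`, and the conventional constants 0.6745 and 3.5 are assumptions chosen for illustration only.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration; not the paper's method.
    """
    med = median(values)
    # Median absolute deviation: a robust estimate of spread.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, so nothing can be ranked
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under a normal distribution (a common convention, assumed here).
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up sample: six readings near 10 and one far away.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
print(mad_outliers(readings))  # prints [25.0]
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value can inflate the latter enough to hide itself.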
Similar Resources
Comparative Study of Incremental Learning Algorithms in Multidimensional Outlier Detection on Data Stream
Multi-dimensional outlier detection (MOD) over data streams is one of the most significant data stream mining techniques. When multivariate data stream in at high speed, outliers must be detected efficiently and accurately. Conventional outlier detection methods are based on observing the full dataset and its statistical distribution, and the data are assumed to be stationary. However, this convention...
Statistical Techniques in Anomaly Intrusion Detection System
In this paper, we analyze an anomaly-based intrusion detection system (IDS) for outlier detection in hardware profiles using statistical techniques: the chi-square distribution, Gaussian mixture distribution, and principal component analysis. Anomaly-detection-based methods can detect new intrusions, but they suffer from false alarms. Host-based Intrusion Detection Systems (HIDSs) use anomaly detectio...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
Outlier Detection by Boosting Regression Trees
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...
A statistical test for outlier identification in data envelopment analysis
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...